Goto

Collaborating Authors

 Ondo State


Building Efficient Lightweight CNN Models

arXiv.org Artificial Intelligence

Convolutional Neural Networks (CNNs) are pivotal in image classification tasks due to their robust feature extraction capabilities. However, their high computational and memory requirements pose challenges for deployment in resource-constrained environments. This paper introduces a methodology to construct lightweight CNNs while maintaining competitive accuracy. The approach integrates two stages of training; dual-input-output model and transfer learning with progressive unfreezing. The dual-input-output model train on original and augmented datasets, enhancing robustness. Progressive unfreezing is applied to the unified model to optimize pre-learned features during fine-tuning, enabling faster convergence and improved model accuracy. The methodology was evaluated on three benchmark datasets; handwritten digit MNIST, fashion MNIST, and CIFAR-10. The proposed model achieved a state-of-the-art accuracy of 99% on the handwritten digit MNIST and 89% on fashion MNIST, with only 14,862 parameters and a model size of 0.17 MB. While performance on CIFAR-10 was comparatively lower (65% with less than 20,00 parameters), the results highlight the scalability of this method. The final model demonstrated fast inference times and low latency, making it suitable for real-time applications. Future directions include exploring advanced augmentation techniques, improving architectural scalability for complex datasets, and extending the methodology to tasks beyond classification. This research underscores the potential for creating efficient, scalable, and task-specific CNNs for diverse applications.


Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation

arXiv.org Artificial Intelligence

The reranker and generator are two critical components in the Retrieval-Augmented Generation (i.e., RAG) pipeline, responsible for ranking relevant documents and generating responses. However, due to differences in pre-training data and objectives, there is an inevitable gap between the documents ranked as relevant by the reranker and those required by the generator to support answering the query. To address this gap, we propose RADIO, a novel and practical preference alignment framework with RAtionale DIstillatiOn. Specifically, We first propose a rationale extraction method that leverages the reasoning capabilities of Large Language Models (LLMs) to extract the rationales necessary for answering the query. Subsequently, a rationale-based alignment process is designed to rerank the documents based on the extracted rationales, and fine-tune the reranker to align the preferences. We conduct extensive experiments on two tasks across three datasets to demonstrate the effectiveness of our approach compared to baseline methods. Our code is released online to ease reproduction.


GIS Copilot: Towards an Autonomous GIS Agent for Spatial Analysis

arXiv.org Artificial Intelligence

Recent advancements in Generative AI offer promising capabilities for spatial analysis. Despite their potential, the integration of generative AI with established GIS platforms remains underexplored. In this study, we propose a framework for integrating LLMs directly into existing GIS platforms, using QGIS as an example. Our approach leverages the reasoning and programming capabilities of LLMs to autonomously generate spatial analysis workflows and code through an informed agent that has comprehensive documentation of key GIS tools and parameters. The implementation of this framework resulted in the development of a "GIS Copilot" that allows GIS users to interact with QGIS using natural language commands for spatial analysis. The GIS Copilot was evaluated with over 100 spatial analysis tasks with three complexity levels: basic tasks that require one GIS tool and typically involve one data layer to perform simple operations; intermediate tasks involving multi-step processes with multiple tools, guided by user instructions; and advanced tasks which involve multi-step processes that require multiple tools but not guided by user instructions, necessitating the agent to independently decide on and executes the necessary steps. The evaluation reveals that the GIS Copilot demonstrates strong potential in automating foundational GIS operations, with a high success rate in tool selection and code generation for basic and intermediate tasks, while challenges remain in achieving full autonomy for more complex tasks. This study contributes to the emerging vision of Autonomous GIS, providing a pathway for non-experts to engage with geospatial analysis with minimal prior expertise. While full autonomy is yet to be achieved, the GIS Copilot demonstrates significant potential for simplifying GIS workflows and enhancing decision-making processes.


Accident Impact Prediction based on a deep convolutional and recurrent neural network model

arXiv.org Artificial Intelligence

Traffic accidents pose a significant threat to public safety, resulting in numerous fatalities, injuries, and a substantial economic burden each year. The development of predictive models capable of real-time forecasting of post-accident impact using readily available data can play a crucial role in preventing adverse outcomes and enhancing overall safety. However, existing accident predictive models encounter two main challenges: first, reliance on either costly or non-real-time data, and second the absence of a comprehensive metric to measure post-accident impact accurately. To address these limitations, this study proposes a deep neural network model known as the cascade model. It leverages readily available real-world data from Los Angeles County to predict post-accident impacts. The model consists of two components: Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). The LSTM model captures temporal patterns, while the CNN extracts patterns from the sparse accident dataset. Furthermore, an external traffic congestion dataset is incorporated to derive a new feature called the "accident impact" factor, which quantifies the influence of an accident on surrounding traffic flow. Extensive experiments were conducted to demonstrate the effectiveness of the proposed hybrid machine learning method in predicting the post-accident impact compared to state-of-the-art baselines. The results reveal a higher precision in predicting minimal impacts (i.e., cases with no reported accidents) and a higher recall in predicting more significant impacts (i.e., cases with reported accidents).


Optimized Quality of Service prediction in FSO Links over South Africa using Ensemble Learning

arXiv.org Machine Learning

Fibre optic communication system is expected to increase exponentially in terms of application due to the numerous advantages over copper wires. The optical network evolution presents several advantages such as over long-distance, low-power requirement, higher carrying capacity and high bandwidth among others Such network bandwidth surpasses methods of transmission that include copper cables and microwaves. Despite these benefits, free-space optical communications are severely impacted by harsh weather situations like mist, precipitation, blizzard, fume, soil, and drizzle debris in the atmosphere, all of which have an impact on the Quality of Service (QoS) rendered by the systems. The primary goal of this article is to optimize the QoS using the ensemble learning models Random Forest, ADaBoost Regression, Stacking Regression, Gradient Boost Regression, and Multilayer Neural Network. To accomplish the stated goal, meteorological data, visibility, wind speed, and altitude were obtained from the South Africa Weather Services archive during a ten-year period (2010 to 2019) at four different locations: Polokwane, Kimberley, Bloemfontein, and George. We estimated the data rate, power received, fog-induced attenuation, bit error rate and power penalty using the collected and processed data. The RMSE and R-squared values of the model across all the study locations, Polokwane, Kimberley, Bloemfontein, and George, are 0.0073 and 0.9951, 0.0065 and 0.9998, 0.0060 and 0.9941, and 0.0032 and 0.9906, respectively. The result showed that using ensemble learning techniques in transmission modeling can significantly enhance service quality and meet customer service level agreements and ensemble method was successful in efficiently optimizing the signal to noise ratio, which in turn enhanced the QoS at the point of reception.


AlzhiNet: Traversing from 2DCNN to 3DCNN, Towards Early Detection and Diagnosis of Alzheimer's Disease

arXiv.org Artificial Intelligence

Alzheimer's disease (AD) is a progressive neurodegenerative disorder with increasing prevalence among the aging population, necessitating early and accurate diagnosis for effective disease management. In this study, we present a novel hybrid deep learning framework that integrates both 2D Convolutional Neural Networks (2D-CNN) and 3D Convolutional Neural Networks (3D-CNN), along with a custom loss function and volumetric data augmentation, to enhance feature extraction and improve classification performance in AD diagnosis. According to extensive experiments, AlzhiNet outperforms standalone 2D and 3D models, highlighting the importance of combining these complementary representations of data. The depth and quality of 3D volumes derived from the augmented 2D slices also significantly influence the model's performance. The results indicate that carefully selecting weighting factors in hybrid predictions is imperative for achieving optimal results. Our framework has been validated on the Magnetic Resonance Imaging (MRI) from Kaggle and MIRIAD datasets, obtaining accuracies of 98.9% and 99.99%, respectively, with an AUC of 100%. Furthermore, AlzhiNet was studied under a variety of perturbation scenarios on the Alzheimer's Kaggle dataset, including Gaussian noise, brightness, contrast, salt and pepper noise, color jitter, and occlusion. The results obtained show that AlzhiNet is more robust to perturbations than ResNet-18, making it an excellent choice for real-world applications. This approach represents a promising advancement in the early diagnosis and treatment planning for Alzheimer's disease.


Voices Unheard: NLP Resources and Models for Yor\`ub\'a Regional Dialects

arXiv.org Artificial Intelligence

Yor\`ub\'a an African language with roughly 47 million speakers encompasses a continuum with several dialects. Recent efforts to develop NLP technologies for African languages have focused on their standard dialects, resulting in disparities for dialects and varieties for which there are little to no resources or tools. We take steps towards bridging this gap by introducing a new high-quality parallel text and speech corpus YOR\`ULECT across three domains and four regional Yor\`ub\'a dialects. To develop this corpus, we engaged native speakers, travelling to communities where these dialects are spoken, to collect text and speech data. Using our newly created corpus, we conducted extensive experiments on (text) machine translation, automatic speech recognition, and speech-to-text translation. Our results reveal substantial performance disparities between standard Yor\`ub\'a and the other dialects across all tasks. However, we also show that with dialect-adaptive finetuning, we are able to narrow this gap. We believe our dataset and experimental analysis will contribute greatly to developing NLP tools for Yor\`ub\'a and its dialects, and potentially for other African languages, by improving our understanding of existing challenges and offering a high-quality dataset for further development. We release YOR\`ULECT dataset and models publicly under an open license.


A Survey on Deep Learning and State-of-the-art Applications

arXiv.org Artificial Intelligence

Deep learning, a branch of artificial intelligence, is a computational model that uses multiple layers of interconnected units (neurons) to learn intricate patterns and representations directly from raw input data. Empowered by this learning capability, it has become a powerful tool for solving complex problems and is the core driver of many groundbreaking technologies and innovations. Building a deep learning model is a challenging task due to the algorithm`s complexity and the dynamic nature of real-world problems. Several studies have reviewed deep learning concepts and applications. However, the studies mostly focused on the types of deep learning models and convolutional neural network architectures, offering limited coverage of the state-of-the-art of deep learning models and their applications in solving complex problems across different domains. Therefore, motivated by the limitations, this study aims to comprehensively review the state-of-the-art deep learning models in computer vision, natural language processing, time series analysis and pervasive computing. We highlight the key features of the models and their effectiveness in solving the problems within each domain. Furthermore, this study presents the fundamentals of deep learning, various deep learning model types and prominent convolutional neural network architectures. Finally, challenges and future directions in deep learning research are discussed to offer a broader perspective for future researchers.


A Robust Machine Learning Approach for Path Loss Prediction in 5G Networks with Nested Cross Validation

arXiv.org Artificial Intelligence

The design and deployment of fifth-generation (5G) wireless networks pose significant challenges due to the increasing number of wireless devices. Path loss has a landmark importance in network performance optimization, and accurate prediction of the path loss, which characterizes the attenuation of signal power during transmission, is critical for effective network planning, coverage estimation, and optimization. In this sense, we utilize machine learning (ML) methods, which overcome conventional path loss prediction models drawbacks, for path loss prediction in a 5G network system to facilitate more accurate network planning, resource optimization, and performance improvement in wireless communication systems. To this end, we utilize a novel approach, nested cross validation scheme, with ML to prevent overfitting, thereby getting better generalization error and stable results for ML deployment. First, we acquire a publicly available dataset obtained through a comprehensive measurement campaign conducted in an urban macro-cell scenario located in Beijing, China. The dataset includes crucial information such as longitude, latitude, elevation, altitude, clutter height, and distance, which are utilized as essential features to predict the path loss in the 5G network system. We deploy Support Vector Regression (SVR), CatBoost Regression (CBR), eXtreme Gradient Boosting Regression (XGBR), Artificial Neural Network (ANN), and Random Forest (RF) methods to predict the path loss, and compare the prediction results in terms of Mean Absolute Error (MAE) and Mean Square Error (MSE). As per obtained results, XGBR outperforms the rest of the methods. It outperforms CBR with a slight performance differences by 0.4 % and 1 % in terms of MAE and MSE metrics, respectively. On the other hand, it outperforms the rest of the methods with clear performance differences.


clustering an african hairstyle dataset using pca and k-means

arXiv.org Artificial Intelligence

The adoption of digital transformation was not expressed in building an African face shape classifier. In this paper, an approach is presented that uses k-means to classify African women images. African women rely on beauty standards recommendations, personal preference, or the newest trends in hairstyles to decide on the appropriate hairstyle for them. In this paper, an approach is presented that uses K-means clustering to classify African women's images. In order to identify potential facial clusters, Haarcascade is used for feature-based training, and K-means clustering is applied for image classification.